In order to compress quantum messages without loss of information it is necessary to allow the length of the encoded messages to vary. We develop a general framework for variable-length quantum messages in close analogy to the classical case and show that lossless compression is only possible if the message to be compressed is known to the sender. The lossless compression of an ensemble of messages is bounded from below by its von Neumann entropy. We show that it is possible to reduce the number of qbits passing through a quantum channel even below the von Neumann entropy by adding a classical side-channel. We give an explicit communication protocol that realizes lossless and instantaneous quantum data compression and apply it to a simple example. This protocol can be used for both online quantum communication and storage of quantum data.



I. INTRODUCTION 



Any physical system can be considered as a carrier of 
information because the state of that system could in 
principle have been intentionally manipulated to repre- 
sent a message. The state of a system composed from 
distinguishable subsystems forms a message of a certain 
length, where each subsystem represents one letter. In 
quantum information theory, the systems are quantum 
and the system states represent quantum messages. A message is compressed if it is mapped to a shorter message; if this map is reversible, then no information has been lost. Schumacher was the first to present a method for quantum data compression. It is based on the concept of encoding only a typical subspace spanned by the typical sequences emitted by a memoryless source. Since then there have been further investigations, but all compression methods considered so far are faithful only in the limit of large block lengths. Now we ask: Is it possible
to compress quantum messages without any loss of in- 
formation? To answer this question some basic concepts 
of quantum information theory have to be revisited. In 
particular, the requirement of a fixed block length for 
quantum messages has to be abandoned and must be re- 
placed by a more general theory of quantum messages 
which enables a flexible and easy treatment of quantum 
codes involving codewords of variable length. First, we develop a general framework in close analogy to the classical case, based on previous work by one of us [9,10]. A different approach to variable-length quantum messages (appearing as a special case in our formalism) has been worked out by Braunstein et al. and by Schumacher and Westmoreland. We define a measure of information quantifying the effort of communication. Compression then means reducing this effort. We argue that prefix codes are not very useful in practice for quantum coding and suggest a different method involving an additional classical side-channel. With the help of this channel, certain problems of instantaneous quantum communication can be avoided and, moreover, the quantum channel can be used with higher efficiency. Finally, we present a communication protocol that enables lossless and instantaneous quantum data compression, and we demonstrate its efficiency by an explicit example. Let us start by reviewing the fundamental notion of a code (see footnote 1).



FIG. 1. (Diagram: the source set $\Omega$ is mapped into the message set $\mathcal{A}^+$.) A classical code is a map from a set of source objects into a set of codewords composed from an alphabet. An ensemble of source objects is mapped to an ensemble of codewords. For variable-length codes, the length of the codewords is allowed to vary.



1 Some of the notions and definitions used here already exist; others are based on our own reasoning. When we find an existing definition equal or similar to the desired one, we use it, and in case it is not a standard definition, we give an explicit reference. For a thorough review of classical information theory, see [11,12]; for a thorough review of quantum information theory, see [13], [14].
II. CODES

Basically, when you have a set of things and you want 
to give them a name, then this is a coding task. There is 
a code for bank accounts, telephone devices and inhabi- 
tants of a country, there even is a code for living beings: 
the genetic code. Language is a code for thoughts, which 
are in turn codes for abstract ideas or concrete objects 
of human experience. A code gives meaning to a mes- 
sage, it relates objects to their description. Objects are 
encoded into messages composed from a basic alphabet. 
The number of letters that is needed to describe a partic- 
ular object is a good measure of the information content 
given to the object by the code. This is the key to data 
compression which we will study in the following with a 
focus on quantum codes. 

Classically, a code is a map $c : \Omega \to M$ from a set of objects, $\Omega$, to a set of messages, $M$ (see Fig. 1). It is the messages that can be communicated and not the objects themselves, so communication is always based on a code. Messages (or strings) are sequences of letters taken from an alphabet $\mathcal{A}$ and are denoted by $x^n := x_1 \cdots x_n$, $x_i \in \mathcal{A}$. The empty message is denoted by $x^0 := \emptyset$. All messages of length $n$ form the set

$\mathcal{A}^n := \{ x^n \mid x_i \in \mathcal{A} \}$,    (1)

and the empty message forms the set $\mathcal{A}^0 := \{\emptyset\}$. All strings of finite length form the set of general messages over the alphabet $\mathcal{A}$,

$\mathcal{A}^+ := \bigcup_{n=0}^{\infty} \mathcal{A}^n$.    (2)

Every subset $M \subset \mathcal{A}^+$ is a message set. Now we can precisely define a classical k-ary code as a map $c : \Omega \to \mathcal{A}^+$ with $k := |\mathcal{A}|$. The set $C = c(\Omega)$ is the codebook and each member of $C$ is a codeword. Being a subset of $\mathcal{A}^+$, a codebook is also a message set (just like a nightingale is also a bird). If $C \subset \mathcal{A}^n$ for some $n \in \mathbb{N}$, then $c$ is called a block code, otherwise a variable-length code. There is another important classification: lossless and lossy codes. A code is lossless (or uniquely decodable or non-singular) if there are distinct codewords for distinct objects, i.e. $\forall x, y \in \Omega : x \neq y \Rightarrow c(x) \neq c(y)$. In the case of a lossy code, some objects are mapped to the same encoding. Lossy codes are used when it is more important to reduce the size of the message than to ensure the correct decoding (a fine example is the MP3 code for sound data). For a given probability distribution on $\Omega$, lossy codes can also be useful if the fidelity $F$, i.e. the probability of correct decoding, is close to 1. For lossless codes the fidelity is exactly 1. In this paper, we only consider lossless codes.
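The classical definitions above can be illustrated by a minimal sketch. It assumes that a code is stored as a Python dictionary from source objects to codeword strings; the source set {"a", "b", "c", "d"} and the three example codes are hypothetical and serve only as an illustration.

```python
def is_lossless(code):
    """Return True if distinct objects are mapped to distinct codewords."""
    codewords = list(code.values())
    return len(set(codewords)) == len(codewords)


def is_block_code(code):
    """Return True if all codewords have the same length n."""
    return len({len(word) for word in code.values()}) == 1


# Hypothetical source set {"a", "b", "c", "d"} over the binary alphabet {"0", "1"}.
block_code = {"a": "00", "b": "01", "c": "10", "d": "11"}
variable_code = {"a": "0", "b": "1", "c": "01", "d": "10"}
lossy_code = {"a": "0", "b": "0", "c": "1", "d": "1"}

for name, code in [("block", block_code),
                   ("variable", variable_code),
                   ("lossy", lossy_code)]:
    print(name, "lossless:", is_lossless(code), "block:", is_block_code(code))
```

In this sketch the lossy code maps two objects to each codeword, so the injectivity test fails, while both the block code and the variable-length code are lossless.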

A. The general message space 

The transition from classical to quantum information is simple. We just allow the elements of a source set $\Omega$ to be in superposition. Precisely, we interpret $\Omega$ as an orthonormal basis for a Hilbert space $\mathcal{V}$ and consider every normalized vector of $\mathcal{V}$ as a valid object. Then $\mathcal{V}$ is the linear span of $\Omega$ and we write $\mathcal{V} = \mathrm{Span}(\Omega)$ with $\dim \mathcal{V} = |\Omega|$. The same goes for the messages. We interpret a message set $M$ as an orthonormal basis for a message space $\mathcal{M} = \mathrm{Span}(M)$ with $\dim \mathcal{M} = |M|$ and consider each element of $\mathcal{M}$ as a valid message. The map $c : \mathcal{V} \to \mathcal{M}$ then represents a quantum code with the space $\mathcal{C} = c(\mathcal{V})$ being the code space and the elements of $\mathcal{C}$ being the codewords. In order to preserve linearity, the code must be a linear map and in order to preserve norm, the code must be an isometric map. In the literature, often the code space $\mathcal{C}$ rather than the map $c$ is called a code (this is a bit like calling $f(x)$ a function). However, by saying "code" we will refer to the map $c$ here, in full analogy to the classical case. Now let us find the general message space corresponding to the classical general message set $\mathcal{A}^+$. Interpret the letters of a quantum alphabet $Q$ as an orthonormal basis for a letter space $\mathcal{H} := \mathrm{Span}(Q)$. A letter space $\mathcal{H}$ with $k = \dim \mathcal{H} = |Q|$ is called a k-ary space. Quantum letters are composed into messages by tensor multiplication, giving product messages $|x^n\rangle := |x_1\rangle \otimes \cdots \otimes |x_n\rangle$ that form the set $Q^n := \{ |x^n\rangle \mid |x_i\rangle \in Q \}$ and span the block space $\mathcal{H}^{\otimes n} := \mathrm{Span}(Q^n)$, giving

$\mathcal{H}^{\otimes n} = \bigotimes_{i=1}^{n} \mathcal{H} = \mathcal{H} \otimes \cdots \otimes \mathcal{H}$.    (3)

The space $\mathcal{H}^{\otimes n}$ is the quantum analogue to the set $\mathcal{A}^n$ of classical block messages given by (1), and contains arbitrary superpositions of product messages; those superpositions that are not themselves product messages are called entangled messages. Because superposition and entanglement have no classical interpretation, quantum information is truly different from classical information. The empty message, denoted by $|x^0\rangle = |\emptyset\rangle$, forms the set $Q^0 = \{|\emptyset\rangle\}$ and spans the one-dimensional space $\mathcal{H}^{\otimes 0} := \mathrm{Span}(Q^0)$. Elements of $\mathcal{H}^{\otimes n}$ for some $n \in \mathbb{N}$ are called block messages. The set of all product messages composed from $Q$ is denoted by $Q^+ := \bigcup_{n=0}^{\infty} Q^n$. Now the general message space $\mathcal{H}^{\oplus}$ induced by $\mathcal{H}$ can be defined by $\mathcal{H}^{\oplus} := \mathrm{Span}(Q^+)$, giving

$\mathcal{H}^{\oplus} = \bigoplus_{n=0}^{\infty} \mathcal{H}^{\otimes n} = \mathcal{H}^{\otimes 0} \oplus \mathcal{H} \oplus \mathcal{H}^{\otimes 2} \oplus \cdots$.    (4)

The space $\mathcal{H}^{\oplus}$ is the quantum analogue to the set $\mathcal{A}^+$ of general classical messages given by (2). $\mathcal{H}^{\oplus}$ is a separable Hilbert space with the countable basis $Q^+$. The space $\mathcal{H}^{\oplus}$ is similar to the Fock space in many-particle theory, except that the particles are letters here, which must be distinguishable, so there is no symmetrization or antisymmetrization. The general message space also contains superpositions of messages of distinct length, for example

$\frac{1}{\sqrt{2}} \left( |101\rangle + |11100\rangle \right) \in \mathcal{H}^{\oplus}$,    (5)

if $|0\rangle, |1\rangle \in \mathcal{H}$. Any block space $\mathcal{H}^{\otimes n}$ is a subspace of $\mathcal{H}^{\oplus}$ and is orthogonal to any other block space $\mathcal{H}^{\otimes m}$ with $n \neq m$. Elements with components of distinct length are called variable-length messages (or indeterminate-length messages) to distinguish them from block messages. Any subspace $\mathcal{M} \subset \mathcal{H}^{\oplus}$ is called a message space and its elements are quantum messages.

B. Length operator 

Define the length operator in $\mathcal{H}^{\oplus}$ measuring the length of a message as

$\hat{L} := \sum_{n=0}^{\infty} n\, \Pi_n$,    (6)

where $\Pi_n$ is the projector on the block space $\mathcal{H}^{\otimes n} \subset \mathcal{H}^{\oplus}$, given by

$\Pi_n = \sum_{|x^n\rangle \in Q^n} |x^n\rangle\langle x^n|$.    (7)

As $\hat{L}$ is a quantum observable, the length of a message $|x\rangle \in \mathcal{H}^{\oplus}$ is generally not sharply defined. Rather, the measurement of $\hat{L}$ generally disturbs the message by projecting it on a block space of the corresponding length. The expected length of a message $|x\rangle \in \mathcal{H}^{\oplus}$ is given by

$\overline{L}(x) := \langle x|\hat{L}|x\rangle$.    (8)

However, in $\mathcal{H}^{\oplus}$ there are also messages whose expected length is infinite. Classical analogues are probability distributions with non-existing moments, e.g. the Lorentz distribution. Block messages are eigenvectors of $\hat{L}$, that is, $\hat{L}|x\rangle = n\,|x\rangle$ for all $|x\rangle \in \mathcal{H}^{\otimes n}$.
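As an illustrative sketch of Eqs. (6)-(8), a message can be stored as a dictionary that maps product-basis strings to amplitudes. This representation and the function names are assumptions made here for illustration. The probability of measuring length n is then the total weight of all components of length n, and the expected length is the corresponding average.

```python
import math


def length_distribution(state):
    """Probability <x|Pi_n|x> of each outcome n of a length measurement."""
    dist = {}
    for word, amplitude in state.items():
        dist[len(word)] = dist.get(len(word), 0.0) + abs(amplitude) ** 2
    return dist


def expected_length(state):
    """Expected length <x|L|x> as the probability-weighted average length."""
    return sum(n * p for n, p in length_distribution(state).items())


a = 1 / math.sqrt(2)
variable_message = {"101": a, "11100": a}   # the example state of Eq. (5)
block_message = {"00": a, "11": a}          # an eigenvector of the length operator

print(length_distribution(variable_message))
print(expected_length(variable_message))
print(expected_length(block_message))
```

For the state of Eq. (5) the two possible outcomes are the lengths 3 and 5, each with probability one half, so the expected length is 4, a value that no single length measurement returns.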

The generalization to statistical ensembles is straightforward. Consider an ensemble $\mathcal{E} = \{p, X\}$ of variable-length messages $|x\rangle \in X \subset \mathcal{H}^{\oplus}$ occurring with probability $p(x) > 0$ for all $|x\rangle \in X$ such that $\sum_{x \in X} p(x) = 1$. Then there is a density operator

$\sigma = \sum_{x \in X} p(x)\, |x\rangle\langle x|$,    (9)

called a statistical quantum message, representing the ensemble $\mathcal{E}$. The set of all such density operators is denoted by $S(\mathcal{H}^{\oplus})$. Conversely, however, for a given density operator $\sigma \in S(\mathcal{H}^{\oplus})$ there is in general a non-countable set of corresponding ensembles. In terms of information theory, $\sigma$ cannot be regarded as a lossless code for the ensemble $\mathcal{E}$. There is more information in the ensemble than in the corresponding density operator. As we will see, this additional a priori knowledge is in fact needed to make lossless compression possible.

The expected length of an ensemble $\mathcal{E}$ or of the corresponding statistical message $\sigma \in S(\mathcal{H}^{\oplus})$ is defined as

$\overline{L}(\mathcal{E}) = \overline{L}(\sigma) := \mathrm{Tr}\{\sigma \hat{L}\} = \sum_{x \in X} p(x)\, \overline{L}(x)$.    (10)

C. Base length 

The expected length of a quantum message $|x\rangle$, given by (8), will in general not be the outcome of a length measurement. Every length measurement results in one of the length eigenvalues supported by $|x\rangle$ and generally disturbs the message. If there is a maximum value resulting from a length measurement of a state $|x\rangle$, namely the length of the longest component of $|x\rangle$, then let us call it the base length of $|x\rangle$, defined as

$\underline{L}(x) := \max\{ n \in \mathbb{N} \mid \langle x|\Pi_n|x\rangle > 0 \}$.    (11)

For example, the quantum message

$|x\rangle = \frac{1}{\sqrt{2}} \left( |abra\rangle + |cadabra\rangle \right)$    (12)

has base length 7. Since the base length of a state is the size of its longest component, we have

$\underline{L}(x) \geq \overline{L}(x)$.    (13)

It is important to note that the base length is not an observable. It is only available if the message $|x\rangle$ is a priori known.
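The difference between base length and expected length can be checked numerically for the example of Eq. (12). The following sketch again assumes that a message is given as a dictionary from product-basis strings to amplitudes. It uses the full classical description of the state, which reflects the remark that the base length is available only if the message is known a priori.

```python
import math


def base_length(state, tol=1e-12):
    """Length of the longest component with non-vanishing weight, Eq. (11)."""
    return max(len(word) for word, amp in state.items() if abs(amp) ** 2 > tol)


def expected_length(state):
    """Expected length <x|L|x>, Eq. (8)."""
    return sum(len(word) * abs(amp) ** 2 for word, amp in state.items())


a = 1 / math.sqrt(2)
x = {"abra": a, "cadabra": a}   # the state of Eq. (12)

print(base_length(x))
print(expected_length(x))
```

Here the base length is 7 while the expected length is the average of 4 and 7, in agreement with the inequality (13).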



D. Quantum code 

Now we can precisely define a k-ary quantum code to be a linear isometric map $c : \mathcal{V} \to \mathcal{H}^{\oplus}$, where $\mathcal{V}$ is a Hilbert space and $\mathcal{H}^{\oplus}$ is the general message space induced by a letter space $\mathcal{H}$ of dimension $k$. The image of $\mathcal{V}$ under $c$ is the code space $\mathcal{C} = c(\mathcal{V})$ (see Fig. 2).

FIG. 2. (Diagram: the code maps the source space into the message space.) A quantum code is a linear isometric map from a source space of quantum objects into a code space of codewords composed from a quantum alphabet. Superpositions of source objects are encoded into superpositions of codewords. An ensemble of source objects is mapped to an ensemble of codewords. For a variable-length quantum code, the length of the codewords is allowed to vary. Superpositions of codewords of distinct length lead to codewords of indeterminate length. The base length of a codeword is defined as the length of the longest component.

Being a quantum analogue to the codebook, $\mathcal{C}$ is the space of valid codewords. The code $c$ is uniquely specified by the transformation rule

$|\omega\rangle \mapsto |\gamma\rangle$,    (14)

where $|\omega\rangle$ are elements of a fixed orthonormal basis $B_{\mathcal{V}}$ of $\mathcal{V}$ and $|\gamma\rangle = |c(\omega)\rangle$ are elements of an orthonormal basis $B_{\mathcal{C}}$ of $\mathcal{C}$. Since $c$ is an isometric map, i.e. $\langle\omega|\omega'\rangle = \langle c(\omega)|c(\omega')\rangle$, this implies that $|c(\omega)\rangle \neq |c(\omega')\rangle$ for all $|\omega\rangle \neq |\omega'\rangle$ in $\mathcal{V}$, so $c$ is a lossless code with an inverse $c^{-1}$. The quantum code $c$ can be represented by the isometric operator

$C := \sum_{\omega \in B_{\mathcal{V}}} |c(\omega)\rangle\langle\omega| = \sum_{\gamma \in B_{\mathcal{C}}} |\gamma\rangle\langle c^{-1}(\gamma)|$,    (15)

called the encoder of $c$. Since $c$ is lossless, there is an inverse operator

$D := C^{-1} = \sum_{\omega \in B_{\mathcal{V}}} |\omega\rangle\langle c(\omega)| = \sum_{\gamma \in B_{\mathcal{C}}} |c^{-1}(\gamma)\rangle\langle\gamma|$,    (16)

called the decoder. In practice, the source space $\mathcal{V}$ and the code space $\mathcal{C}$ are often subspaces of one and the same physical space $\mathcal{R}$. Since $C$ is an isometric operator between $\mathcal{V}$ and $\mathcal{C}$, there is a (non-unique) unitary extension $U_C$ on $\mathcal{R}$ with

$U_C |x\rangle = C |x\rangle, \quad \forall |x\rangle \in \mathcal{V} \subset \mathcal{R}$,    (17)

$U_C^{\dagger} |y\rangle = C^{-1} |y\rangle, \quad \forall |y\rangle \in \mathcal{C} \subset \mathcal{R}$.    (18)

However, using $C$ and distinguishing between $\mathcal{V}$ and $\mathcal{C}$ is more convenient and more general. Codes with $\mathcal{C} \subset \mathcal{H}^{\otimes n}$ for some $n \in \mathbb{N}$ are called block codes, otherwise variable-length codes.



III. REALIZING VARIABLE-LENGTH 
MESSAGES 

Variable-length messages could in principle be realized directly by a quantum system whose particle number is not conserved, for instance, an electromagnetic field. Each photon may carry letter information by its field mode, while the number of photons may represent the length of the message. The photons can be ordered either using their spacetime position (e.g. single photons running through a wire) or some internal state with many degrees of freedom (e.g. a photon with frequency $\omega_2$ can be defined to "follow" a photon with frequency $\omega_1 < \omega_2$). The Hilbert space representing such a system of distinguishable particles with non-conserved particle number simply is the message space $\mathcal{H}^{\oplus}$. In case we only have a system at hand in which the number of particles is conserved, we can also realize variable-length messages by embedding them into block spaces.

It is a good idea to distinguish the message space, which is a purely abstract space, from its physical realization. Let us call the physical realization of a message space $\mathcal{M}$ the operational space $\tilde{\mathcal{M}}$. Between $\mathcal{M}$ and $\tilde{\mathcal{M}}$ there is an isometric map, so $\dim \mathcal{M} = \dim \tilde{\mathcal{M}}$. This is expressed by $\mathcal{M} \cong \tilde{\mathcal{M}}$. The operational space $\tilde{\mathcal{M}}$ is the space of physical states of a system representing valid codewords of $\mathcal{M}$. Often the operational space is a subspace of the total space of all physical states of the system. Denoting the total physical space by $\mathcal{R}$ we have

$\mathcal{M} \cong \tilde{\mathcal{M}} \subset \mathcal{R}$.    (19)

A. Bounded message spaces
The general message space $\mathcal{H}^{\oplus}$ is the "mother" of all message spaces induced by the letter space $\mathcal{H}$. It contains every quantum message that can be composed using letters from $\mathcal{H}$ and the laws of quantum mechanics. However, it is an abstract space, i.e. independent of a particular physical implementation. It would be good to know if such a space can also be realized physically. It is clear that with a finite system you can only realize a finite-dimensional subspace of the general message space, whose dimension is infinite. So let us start with the physical realization of the r-bounded message space

$\mathcal{H}^{\oplus r} := \bigoplus_{n=0}^{r} \mathcal{H}^{\otimes n}$,    (20)

containing all superpositions of messages of maximal length $r$.

Say you have a physical space $\mathcal{R} = \mathcal{D}^{\otimes s}$ representing a register consisting of $s$ systems with $\dim \mathcal{D} = k$. Each subspace $\mathcal{D}$ represents one quantum digit in the register. In the case $k = 2$ the quantum digits are quantum bits, in short "qbits". The physical space $\mathcal{R}$ represents the space of all physical states of the register, while the message space $\mathcal{H}^{\oplus r}$ represents the space of valid codewords that can be held by the register and it is isomorphic to a subspace $\tilde{\mathcal{H}}^{\oplus r}$ of the physical space $\mathcal{R}$. Let $\dim \mathcal{H} = k$, then you must choose $s$ such that

$\dim(\mathcal{H}^{\oplus r}) \leq \dim(\mathcal{D}^{\otimes s})$    (21)

$\Rightarrow \sum_{n=0}^{r} k^n = \frac{k^{r+1} - 1}{k - 1} \leq k^s$    (22)

$\Rightarrow s \geq r + 1$.    (23)
Thus you need a register of at least $(r + 1)$ digits to realize the message space $\mathcal{H}^{\oplus r}$. Choose the smallest possible register space $\mathcal{R} = \mathcal{D}^{\otimes (r+1)}$. Since at most $r$ digits are carrying information, one digit can be used to indicate either the beginning or the end of the message. Now you can conveniently use k-ary representations of natural numbers as codewords. Each natural number $i$ has a unique k-ary representation $Z_k(i)$. For instance, $Z_2(3) = 11$ and $Z_{16}(243) = F3$. All k-ary representations have a neutral prefix "0" that can precede the representation without changing its value, e.g. 000011 = 11. For a natural number $n > 0$, define $Z_k^n(i)$ as the n-extended k-ary representation of $i$ by

$Z_k^n(i) := \underbrace{0 \cdots 0\, Z_k(i)}_{n}, \quad 0 \leq i \leq k^n - 1$.    (24)

For example, $Z_2^6(3) = 000011$ and $Z_{16}^6(243) = 0000F3$. Let us define that the message starts after the first appearance of "1", e.g. the register content 000102540 represents the message 02540. Now define orthonormal vectors

$|e_i^n\rangle := |0 \cdots 0\, 1\, Z_k^n(i)\rangle \in \mathcal{R}$,    (25)

where $n > 0$ and $0 \leq i \leq k^n - 1$. The $n$ digits of $Z_k^n(i)$ are called significant digits. The empty message corresponds to the unit vector

$|\emptyset\rangle := |e^0\rangle := |0 \cdots 0 1\rangle$.    (26)

Obviously, $|\emptyset\rangle$ has no significant digits.

FIG. 3. Realizing a general variable-length message. (Diagram: a register consisting of redundant digits, a start digit, and significant digits.)
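The register layout of Fig. 3 can be sketched with ordinary strings standing in for the basis labels. The function names below are hypothetical, and digits beyond 9 are written with capital letters, as in the hexadecimal example of the text. The sketch covers the k-ary representation, its n-extended form (24), the labels of the basis vectors (25), the rule that the message starts after the first "1", and the register size required by (21)-(23).

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def z(k, i):
    """k-ary representation Z_k(i) without leading zeroes (here Z_k(0) is "0")."""
    if i == 0:
        return "0"
    word = ""
    while i > 0:
        word = DIGITS[i % k] + word
        i //= k
    return word


def z_ext(k, n, i):
    """n-extended representation Z_k^n(i): Z_k(i) padded with leading zeroes."""
    if not 0 <= i <= k ** n - 1:
        raise ValueError("i must satisfy 0 <= i <= k**n - 1")
    return z(k, i).rjust(n, "0")


def register_label(k, r, n, i):
    """Label of |e_i^n>: redundant zeroes, start digit 1, n significant digits."""
    significant = z_ext(k, n, i) if n > 0 else ""
    return "0" * (r - n) + "1" + significant


def read_message(register):
    """The message consists of all digits after the first appearance of '1'."""
    return register[register.index("1") + 1:]


def min_register_digits(k, r):
    """Smallest s with dim(H^{+r}) = sum_{n=0}^{r} k^n <= k^s."""
    dim = sum(k ** n for n in range(r + 1))
    s = 0
    while k ** s < dim:
        s += 1
    return s


print(z(2, 3), z(16, 243), z_ext(2, 6, 3), z_ext(16, 6, 243))
print(register_label(2, 5, 2, 3), read_message("000102540"))
print(min_register_digits(2, 5))
```

Every label produced by `register_label` has the fixed physical length r + 1, while the number of digits returned by `read_message` is the significant length of the message.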

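Next, define orthonormal basis sets

$\tilde{B}^n := \{ |e_0^n\rangle, \ldots, |e_{k^n - 1}^n\rangle \}, \quad 0 \leq n \leq r$,    (27)

that span the operational block spaces

$\tilde{\mathcal{H}}^{\otimes n} = \mathrm{Span}(\tilde{B}^n)$.    (28)

Note that the operational block space $\tilde{\mathcal{H}}^{\otimes n}$ is truly different from the abstract block space $\mathcal{H}^{\otimes n}$: although both have dimension $k^n$, the vectors of $\tilde{\mathcal{H}}^{\otimes n}$ are states of the full register $\mathcal{R}$, which has dimension $k^{r+1}$. Next, define an orthonormal basis

$\tilde{B}^+ := \bigcup_{n=0}^{r} \tilde{B}^n$    (29)

and construct the operational space $\tilde{\mathcal{H}}^{\oplus r} \subset \mathcal{R}$ by

$\tilde{\mathcal{H}}^{\oplus r} := \mathrm{Span}(\tilde{B}^+)$.    (30)

Altogether, the physical space $\mathcal{R} = \mathcal{D}^{\otimes (r+1)}$ is the space of all physical states of the register, while the operational space $\tilde{\mathcal{H}}^{\oplus r} \subset \mathcal{R}$ is the space of those register states that represent valid codewords, and it is isomorphic to the abstract message space $\mathcal{H}^{\oplus r}$.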

A general message is represented by the vector

$|x\rangle = \sum_{n=0}^{r} \sum_{i=0}^{k^n - 1} x_{n,i}\, |e_i^n\rangle$    (31)

with $\sum_{n=0}^{r} \sum_{i=0}^{k^n - 1} |x_{n,i}|^2 = 1$. The length operator introduced in section II B is here of the form

$\hat{L} := \sum_{n=0}^{r} n\, \Pi_n$,    (32)

because there are at most $r$ digits to constitute a message. Now we need to know how the projectors $\Pi_n$ are constructed in the operational space $\tilde{\mathcal{H}}^{\oplus r}$. For a register state containing a message of sharply defined length, the length eigenvalue $n$ is given by the number of significant digits in that register,

$\hat{L}\, |e_i^n\rangle := n\, |e_i^n\rangle$,    (33)

for $0 \leq i \leq k^n - 1$. Each projector is then defined by

$\Pi_n := \sum_{i=0}^{k^n - 1} |e_i^n\rangle\langle e_i^n|$    (34)

and projects onto the space $\tilde{\mathcal{H}}^{\otimes n} \subset \mathcal{R}$. Note that the physical length of each message is always given by the fixed size $(r + 1)$ of the register. Only the significant length of a message, i.e. the number of digits that constitute a message contained in the register, is in general not sharply defined. Note further that the particular form of the length operator depends on the realization of the message space.

In the limit of large $r$ we have $\lim_{r \to \infty} \mathcal{H}^{\oplus r} = \mathcal{H}^{\oplus}$, but that space can no longer be embedded into a physical space $\mathcal{R} = \mathcal{D}^{\otimes \infty} := \lim_{n \to \infty} \mathcal{D}^{\otimes n}$, since the latter is no longer a separable Hilbert space. However, we can think of $r$ as very large, such that working in $\mathcal{H}^{\oplus}$ just means working with a quantum computer having enough memory.

B. Realizing more message spaces 

A code is a map $c : \mathcal{V} \to \mathcal{H}^{\oplus}$ from source states in $\mathcal{V}$ to codewords in $\mathcal{H}^{\oplus}$. The space $\mathcal{C} = c(\mathcal{V})$ of all codewords is the code space, and as a subspace of the general message space $\mathcal{H}^{\oplus}$ it is just a special message space. In order to implement a particular code $c$, it is in practice sufficient to realize only the corresponding code space $\mathcal{C}$ by a physical system. Let us realize some important code spaces now. However, we will not discuss the very important class of error-correcting code spaces here, since this would go beyond the scope of this paper.
1. Block spaces

An important message space is the block space $\mathcal{H}^{\otimes n}$, which contains messages of fixed length $n$. Block spaces are the message spaces of standard quantum information theory. They can be realized directly by a register $\mathcal{R} = \mathcal{D}^{\otimes n}$ of $n$ digits, e.g. $n$ two-level systems representing one qbit each.

2. Prefix spaces

Another interesting message space is the space of prefix codewords of maximal length $r$. Such a space contains only superpositions of prefix codewords. A set of codewords is prefix (or prefix-free) if no codeword is the prefix of another codeword. For example, the set $P_3 = \{0, 10, 110, 111\}$ is a set of binary prefix codewords of maximal length 3. Prefix codewords have one significant advantage:

• Prefix codewords are instantaneous, that is, sequences of prefix codewords do not need a word separator. The separator can be added while reading the sequence from left to right. A sequence from $P_3$ can be separated like

110111010110 → 110, 111, 0, 10, 110.    (35)

However, there is also a drawback:

• Prefix codewords are in general not as short as possible.

This is a consequence of the fact that there are in general fewer prefix codewords than possible codewords. For example, if you want to encode 4 different objects, you can use the prefix set $P_3$ above with maximal length 3. If you renounce the prefix property you can use the set {0, 1, 01, 10} with maximal length 2.

A prefix space $\mathcal{P}_r$ of maximal length $r$ is given by the linear span of prefix codewords of maximal length $r$. For the set $P_3$, the corresponding prefix space is $\mathcal{P}_3 = \mathrm{Span}\{|0\rangle, |10\rangle, |110\rangle, |111\rangle\}$. The prefix space $\mathcal{P}_r \subset \mathcal{H}^{\oplus r}$ can physically be realized by a subspace $\tilde{\mathcal{P}}_r$ of the register space $\mathcal{R} = \mathcal{D}^{\otimes r}$ spanned by the prefix codewords which have been extended by zeroes at the end to fit them into the register. For example, $\tilde{\mathcal{P}}_3 = \mathrm{Span}\{|000\rangle, |100\rangle, |110\rangle, |111\rangle\} \subset \mathcal{D}^{\otimes 3}$ is a physical realization of the prefix space $\mathcal{P}_3$. The length operator measures the significant length of the codewords, given by the length of the corresponding prefix codewords.

Schumacher and Westmoreland as well as Braunstein et al. used prefix spaces for their implementation of variable-length quantum coding. However, we will show later on that the significant advantage of prefix codewords in fact vanishes in the quantum case, whereas the disadvantage remains.

3. Neutral-prefix space

A specific code space will be of interest, namely the space of neutral-prefix codewords, which we define as follows. The k-ary representation of a natural number $i$ is denoted by $Z_k(i)$ (see section III A). The empty message is represented by $Z_k(0) = \emptyset$. Define an orthonormal basis

$B_r := \{ |Z_k(0)\rangle, \ldots, |Z_k(k^r - 1)\rangle \}$    (36)

of variable-length messages of maximal length $r$. The length of each basis message $|Z_k(i)\rangle$ is given by

$|Z_k(i)| = \lceil \log_k(i + 1) \rceil$,    (37)

where $\lceil x \rceil$ denotes the smallest integer $\geq x$. These basis messages span the r-bounded neutral-prefix space

$\mathcal{N}_r := \mathrm{Span}(B_r)$.    (38)

Note that $\mathcal{N}_r$ is not equal to the r-bounded message space $\mathcal{H}^{\oplus r}$, as you can see by comparing the dimension $\dim \mathcal{N}_r = k^r$ with $\dim \mathcal{H}^{\oplus r} = \frac{k^{r+1} - 1}{k - 1}$. $\mathcal{N}_r$ is smaller than $\mathcal{H}^{\oplus r}$, because not all messages of $\mathcal{H}^{\oplus r}$ are contained in $\mathcal{N}_r$. For example, the message $|01\rangle$ is in $\mathcal{H}^{\oplus r}$ but not in $\mathcal{N}_r$, hence we have

$\mathcal{N}_r \subset \mathcal{H}^{\oplus r}$.    (39)

Now we want to find a physical realization of $\mathcal{N}_r$. This turns out to be quite easy (see Fig. 4).

FIG. 4. Realizing variable-length messages by neutral-prefix codewords. (Diagram: a register consisting of redundant digits followed by significant digits.)
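The length formula (37) and the dimension count can be verified with a short sketch. It assumes the convention stated above that the number 0 is represented by the empty word; the function names are hypothetical. The ceiling of the logarithm is computed with integer arithmetic to avoid floating-point rounding at exact powers of k.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def neutral_prefix_word(k, i):
    """k-ary representation of i without leading zeroes; 0 gives the empty word."""
    word = ""
    while i > 0:
        word = DIGITS[i % k] + word
        i //= k
    return word


def neutral_prefix_length(k, i):
    """ceil(log_k(i + 1)), i.e. the smallest n with k**n >= i + 1."""
    n = 0
    while k ** n < i + 1:
        n += 1
    return n


def dimensions(k, r):
    """Return (dim N_r, dim H^{+r}) = (k^r, (k^(r+1) - 1) / (k - 1))."""
    return k ** r, (k ** (r + 1) - 1) // (k - 1)


k, r = 2, 3
for i in range(k ** r):
    print(i, repr(neutral_prefix_word(k, i)), neutral_prefix_length(k, i))
print(dimensions(k, r))
```

For k = 2 and r = 3 the neutral-prefix space has dimension 8, whereas the 3-bounded message space has dimension 15; the missing basis messages are exactly those with leading zeroes, such as 01.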



As already noted in section III A, the k-ary representation $Z_k(i)$ of any natural number $i$ can be extended by leading zeroes to the r-extended k-ary representation $Z_k^r(i) := 0 \cdots 0\, Z_k(i)$. Take a register $\mathcal{R} = \mathcal{D}^{\otimes r}$ of $r$ digits with $\mathcal{D} = \mathbb{C}^k$. Then the set

$\tilde{B}_r := \{ |Z_k^r(0)\rangle, \ldots, |Z_k^r(k^r - 1)\rangle \}$    (40)

is an orthonormal basis for the register space $\mathcal{R}$. At the same time it can be regarded as an orthonormal basis for the operational space $\tilde{\mathcal{N}}_r$ representing the neutral-prefix space $\mathcal{N}_r$. While the physical length of each codeword is constantly $r$, the significant length is measured by the length operator

$\hat{L} := \sum_{n=0}^{r} n\, \Pi_n$,    (41)

with mutually orthogonal projectors

$\Pi_n := \sum_{i:\, |Z_k(i)| = n} |Z_k^r(i)\rangle\langle Z_k^r(i)|$.    (42)

Note that the length operator defined in this way looks different from the one defined in section III A. While $\hat{L}$ is always of the same form (32), the projectors $\Pi_n$ are different because the operational spaces are different. The empty message can be defined by

$|\emptyset\rangle := |Z_k^r(0)\rangle = |0 \cdots 0\rangle$.    (43)

A general message in $\tilde{\mathcal{N}}_r$ is given by

$|x\rangle = \sum_{i=0}^{k^r - 1} x_i\, |Z_k^r(i)\rangle$.    (44)

We have realized the neutral-prefix space $\mathcal{N}_r$ by exhausting the entire register space $\mathcal{R}$, so the quantum resources are optimally used. In other words:

• All messages in $\mathcal{N}_r$ are as short as possible.

Remember that the physical realization of $\mathcal{H}^{\oplus r}$ requires one additional digit to represent the beginning or the end of a message. This digit does not contain any message information; it is in a sense wasted. For quantum coding, the additional digit may really count, since it would have to be added each time a codeword is stored or transmitted! Also the prefix space considered in section III B 2 contains messages which are not as short as possible. You can encode a space $\mathcal{V}$ of dimension $\dim \mathcal{V} = 4$ by a prefix space spanned by $\{|000\rangle, |100\rangle, |110\rangle, |111\rangle\}$ with corresponding lengths {1,2,3,3}, but then you need a register of 3 qbits. In contrast to that, $\mathcal{V}$ can be encoded by a neutral-prefix space spanned by the basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ with corresponding lengths {0,1,2,2}, and you need a register of only 2 qbits. In the operational space $\tilde{\mathcal{N}}_r$, the basis messages reveal their length information by simply discarding leading zeroes. That way, not all variable-length messages can be realized, but we save 1 register digit, so $\mathcal{N}_r$ is a good candidate for variable-length quantum coding.
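The comparison between prefix and neutral-prefix codewords for four objects can be made concrete with a classical sketch on the basis labels. The helper functions are hypothetical illustrations: one tests the prefix property of section III B 2, one separates a sequence of prefix codewords while reading from left to right as in Eq. (35), and the last lines compare the codeword lengths quoted above.

```python
def is_prefix_free(words):
    """True if no codeword is the prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in words for b in words)


def split_instantaneously(sequence, words):
    """Read from left to right and cut as soon as a codeword is complete.

    The cut is unambiguous only if `words` is prefix-free.
    """
    found, current = [], ""
    for digit in sequence:
        current += digit
        if current in words:
            found.append(current)
            current = ""
    if current:
        raise ValueError("sequence ends inside a codeword")
    return found


P3 = ["0", "10", "110", "111"]      # prefix codewords, maximal length 3
NON_PREFIX = ["0", "1", "01", "10"]  # shorter, but not prefix-free
NEUTRAL = ["", "1", "10", "11"]      # neutral-prefix words for i = 0, 1, 2, 3

print(is_prefix_free(P3), is_prefix_free(NON_PREFIX))
print(split_instantaneously("110111010110", P3))
print([len(w) for w in P3], [len(w) for w in NEUTRAL])
```

The prefix set needs a register of 3 digits for four objects, while the neutral-prefix words fit into 2 digits, at the price of losing the instantaneous property on the level of the words themselves.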



IV. DATA COMPRESSION 

A. Classical data compression 

Intuitively, compression is achieved when the effort to store or communicate the codewords is minimized. But how can we precisely define that "effort"? The key idea is the concept of a raw code. One can always construct a code for $\Omega$ by inventing a new letter for each single object. Such a classical raw code is a code $c : \Omega \to \mathcal{A}$ for some alphabet $\mathcal{A}$ of the same size as $\Omega$. Chinese writing is a fairly good illustration of a raw code. There are up to 50,000 letters representing a manifold of abstract and concrete things, e.g. the "noise of a running horse". The length of the code is minimized to 1, but the encoding and decoding machines will need a large memory to remember all the letters. Obviously, a raw code does not compress at all, so it is a good idea to set the effort of communication in relation to the raw information content of $\Omega$ (a similar notion appears in [14], p. 71, and it is, interestingly, also similar to the Boltzmann entropy of a microcanonical ensemble), defined by

$I_0(\Omega) := \log_2 |\Omega|$.    (45)

$I_0(\Omega)$ represents the number of binary digits (bits) needed to enumerate the elements of $\Omega$. This motivates the following definition. The code information content of an individual object in an arbitrary set $\Omega$ for a given k-ary code $c : \Omega \to \mathcal{A}^+$ is defined as

$I_c(x) := \log_2 k \cdot L_c(x), \quad x \in \Omega$,    (46)

where $L_c(x)$ denotes the length of the codeword $c(x) \in \mathcal{A}^+$. $I_c(x)$ represents the number of bits needed to describe the object $x$ by the code $c$. For a raw code $c : \Omega \to \mathcal{A}$, definition (46) gives the raw information content for every object $x \in \Omega$. A few remarks about the code information:

1) The code information is defined for things, not for strings. Of course, things may sometimes also be strings. If so, one can define the direct information of a string $x^n$ over an alphabet $\mathcal{A}$ as

$I(x^n) := n \log_2 |\mathcal{A}|$.    (47)

2) The code information $I_c$ is code dependent, reflecting the philosophy that there is no information contained in an object without a code giving it some meaning. The codeword "XWF$%&$ FggHz((" may be a random sequence of letters or may in a certain code represent the first digits of π or in another code the beginning of a Mozart symphony.

Now let there be a probability distribution $p$ on $\Omega$. We can define the code information of the ensemble $\mathcal{E} = \{p, \Omega\}$ as the average of (46),

$I_c(\mathcal{E}) := \log_2 k \sum_{x \in \Omega} p(x)\, L_c(x)$.    (48)

Compression means reducing the code information of the ensemble. We can define the compression rate achieved by a code $c$ on the ensemble $\mathcal{E}$ by

$R_c(\mathcal{E}) := \frac{I_c(\mathcal{E})}{I_0(\Omega)}$.    (49)

A code $c : \Omega \to C$ is compressive on $\mathcal{E}$ if and only if

$R_c(\mathcal{E}) < 1$, i.e. $I_c(\mathcal{E}) < I_0(\Omega)$.    (50)
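A small numerical sketch may help to see when a code is compressive in the sense of (50). The four-object source set, the two probability distributions and the function names are assumptions chosen for illustration; the code is the binary prefix set {0, 10, 110, 111} used earlier.

```python
import math


def raw_information(num_objects):
    """I_0(Omega) = log2 |Omega|, Eq. (45)."""
    return math.log2(num_objects)


def code_information(code, probabilities, k):
    """I_c(E) = log2(k) * sum_x p(x) L_c(x), Eq. (48)."""
    return math.log2(k) * sum(p * len(code[x]) for x, p in probabilities.items())


def compression_rate(code, probabilities, k):
    """R_c(E) = I_c(E) / I_0(Omega), Eq. (49)."""
    return code_information(code, probabilities, k) / raw_information(len(code))


code = {"a": "0", "b": "10", "c": "110", "d": "111"}
skewed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}

print(compression_rate(code, skewed, k=2))
print(compression_rate(code, uniform, k=2))
```

With the skewed distribution the rate is 1.75 / 2 = 0.875, so the code is compressive; with the uniform distribution the same code gives 2.25 / 2 = 1.125 and is not compressive. Whether a code compresses therefore depends on the ensemble, not on the code alone.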
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B. Quantum data compression 

Now that we have a classical definition of compression, 
the next step is to translate these concepts to the quan- 
tum case. Again, the key is the raw information, i.e. the 
size of a non-compressed message, so let us look for its 
quantum analogue. The raw information (|45| ) of a set 
is 7 (f2) = l°g2 1^1 because we need |fi| distinct letters to 
encode each element of f2 by a raw code. Interpreting 
as an orthonormal basis for a Hilbert space V, the raw 
information of V is also log 2 |0|, because we still need 
distinguishable letters to represent each element of the 
space V. Since = dimV, we define the quantum raw 
information of a space V as 



7 (V) :=log 2 (dimV). 



(51) 



So the quantum raw information Iq corresponding to a 
space V equals the fixed number of qbits needed to rep- 
resent all states in V. 

Now, for a given &;-ary code c : V — > TL® represented 
by an encoder C, the code information operator can be 
defined as 



log 2 k ■ L c 



(52) 



where $L_c := C^{-1} L C$ is the length operator measuring the length of the codeword for a source vector in $\mathcal{V}$. If the code is based on a qbit alphabet, $I_c$ measures the number of qbits forming the code message, hence the measuring unit of $I_c$ is "1 qbit". In analogy to (47), we define the direct information operator acting on the message space $\mathcal{H}^{\oplus}$ by

$$I := \log_2 k \cdot L. \tag{53}$$

In short, the code information operator is defined in an arbitrary Hilbert space $\mathcal{V}$ and depends on a quantum code $c : \mathcal{V} \to \mathcal{H}^{\oplus}$, while the direct information operator is defined in a message space $\mathcal{H}^{\oplus}$ without referring to a quantum code. For a given code, the relation between both operators is

$$I_c = C^{-1} I C. \tag{54}$$



Now you want to compress a codeword by removing redundant quantum digits. The number of quantum digits carrying information is given by the base length of the codeword. All other digits are redundant and can be removed without loss of information. This motivates the definition of the code information of a state $|x\rangle \in \mathcal{V}$ with respect to a code $c$ by

$$\underline{I}_c(x) := \log_2 k \cdot \underline{L}_c(x), \tag{55}$$

where $\underline{L}_c(x) = \underline{L}(c(x))$ is the base length of the codeword for $|x\rangle$. $\underline{I}_c(x)$ represents the number of qbits needed to describe the state $|x\rangle$ by the code $c$. This value must be distinguished from the expected number of qbits $\overline{I}_c(x) = \langle x|I_c|x\rangle$ that is found by performing a length measurement on the codeword for $|x\rangle$. In the classical case, the difference vanishes.
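The distinction between base length and expected length can be made concrete with a small Python sketch. It represents a codeword as a dictionary from basis strings to amplitudes; this representation, the function names and the three-component superposition are assumptions made for illustration, not part of the formalism above.

```python
import math

def base_length(codeword):
    """Length of the longest component with non-zero amplitude."""
    return max(len(s) for s, amp in codeword.items() if abs(amp) > 0)

def expected_length(codeword):
    """Expectation value of the length operator in the given state."""
    norm = sum(abs(a) ** 2 for a in codeword.values())
    return sum(abs(a) ** 2 * len(s) for s, a in codeword.items()) / norm

# Illustrative equal-weight superposition of components of length 7, 4 and 2.
amp = 1 / math.sqrt(3)
superposition = {"1001101": amp, "1101": amp, "10": amp}

# A "classical" codeword has a single component, so both notions coincide.
classical = {"1101": 1.0}

base = base_length(superposition)          # digits that must be kept
expected = expected_length(superposition)  # mean outcome of a length measurement
```

Truncating the superposition to fewer digits than its base length would cut off the longest component, which is why the base length, and not the expected length, governs lossless compression.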

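Now suppose you want to encode an ensemble $\mathcal{E} = \{p, X\}$ of states $|x\rangle \in X$ that span the source space $\mathcal{V}$. Each individual message $|x\rangle$ can be compressed to $\underline{I}_c(x)$ qbits, so the entire ensemble $\mathcal{E}$ will on average be compressed to the code information

$$\underline{I}_c(\mathcal{E}) := \log_2 k \sum_{x \in X} p(x)\, \underline{L}_c(x). \tag{56}$$

The compression rate can then be defined by

$$R_c(\mathcal{E}) := \frac{\underline{I}_c(\mathcal{E})}{I_0(\mathcal{V})}. \tag{57}$$

A code $c$ is compressive on the ensemble $\mathcal{E}$ if and only if

$$R_c(\mathcal{E}) < 1, \quad \text{i.e.} \quad \underline{I}_c(\mathcal{E}) < I_0(\mathcal{V}). \tag{58}$$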

Note that these definitions only apply to lossless codes. 
The lossy case is not considered here. 

V. NO-GO THEOREMS 

Of course, lossy compression is always possible. But let us look for some statements about lossless codes. The first three of the following no-go theorems are also known in classical information theory and are easily transferred to the quantum case by general reasoning. However, we show them by applying the tools developed in this paper. The last theorem is genuinely quantum with no classical analogue.

A. No lossless compression by block codes 

A code is a block code if all codewords have the same length, else it is a variable-length code. Unfortunately, lossless block codes do not compress. Take an arbitrary ensemble $\mathcal{E} = \{p, X\}$ with $X \subset \mathcal{V}$ and any lossless $k$-ary block code $c : \mathcal{V} \to \mathcal{H}^{\otimes n}$. Let $B_\mathcal{V}$ and $B_n$ be orthonormal basis sets of $\mathcal{V}$ and $\mathcal{H}^{\otimes n}$, respectively. In order to find for every basis vector $|\omega\rangle \in B_\mathcal{V}$ a code basis vector $|c(\omega)\rangle \in B_n$, the code must fulfill $\dim \mathcal{V} \le \dim \mathcal{H}^{\otimes n} = k^n$. For every $|x\rangle \in X$, the corresponding codeword $|c(x)\rangle$ has sharp length $\underline{L}_c(x) = n$, hence

$$\underline{I}_c(\mathcal{E}) = \log_2 k \sum_{x \in X} p(x)\, \underline{L}_c(x) = \log_2 k \cdot n = \log_2(k^n) \tag{59}$$

$$\ge \log_2(\dim \mathcal{V}) = I_0(\mathcal{V}), \tag{60}$$



which violates condition (58). This implies that there is no lossless compressing block code. By choosing mutually orthogonal source states one can derive the analogous statement for the classical case.

For long strings emitted by a memoryless source, block codes can achieve almost lossless compression by encoding only typical subspaces. The quantum code performing this type of lossy compression is known as the Schumacher code [1]. The only way to compress messages without loss of information is by use of a variable-length code. In order to achieve compression, more frequent objects must be encoded by shorter messages and less frequent objects by longer messages, so that the average length of the codewords is minimized. This is the general rule of lossless data compression.
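The block-code argument above is easy to check numerically. The following Python sketch, with hypothetical helper names, finds the shortest block length $n$ satisfying $k^n \ge \dim \mathcal{V}$ and evaluates the resulting compression rate, which by (59) and (60) can never drop below 1.

```python
import math

def min_block_length(dim_v, k):
    """Smallest n such that k**n >= dim_v (needed for a lossless block code)."""
    n = 0
    while k ** n < dim_v:
        n += 1
    return n

def block_code_rate(dim_v, k):
    """Rate n*log2(k) / log2(dim_v) of the shortest lossless block code (dim_v >= 2)."""
    return min_block_length(dim_v, k) * math.log2(k) / math.log2(dim_v)

# Scan a range of source dimensions and alphabet sizes.
rates = {(d, k): block_code_rate(d, k) for d in range(2, 65) for k in range(2, 6)}
```

The rate equals 1 only when the source dimension is an exact power of $k$; in all other cases the block code lengthens the messages.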



B. No lossless compression by changing the alphabet

Trying to achieve compression by using a different alphabet does not work.

A code $c : \mathcal{H}_A^{\otimes n} \to \mathcal{H}_B^{\otimes m}$ that transforms messages over some letter space $\mathcal{H}_A$ into messages over some letter space $\mathcal{H}_B$ is lossless only if $\dim \mathcal{H}_A^{\otimes n} \le \dim \mathcal{H}_B^{\otimes m}$, which implies that

$$I_0(\mathcal{V}) = n \log_2(\dim \mathcal{H}_A) \tag{61}$$

$$\le m \log_2(\dim \mathcal{H}_B) = \underline{I}_c(x), \tag{62}$$

for every $|x\rangle \in \mathcal{H}_A^{\otimes n}$. So for every ensemble $\mathcal{E} = \{p, X\}$ of messages $|x\rangle \in \mathcal{H}_A^{\otimes n}$, we have $\underline{I}_c(\mathcal{E}) = \underline{I}_c(x) \ge I_0(\mathcal{V})$, which violates condition (58). By choosing mutually orthogonal source states, one can derive the analogous statement for the classical case. This paper would probably look much shorter when written in Chinese symbols. However, the effort of communication, which is expressed by the code information $\underline{I}_c$, would not be reduced.

C. No universal lossless compression

We have seen that it is not possible to compress messages without loss of information by using a block code or by using a different letter space. Now we will see that no code can compress all messages without loss of information.

Say you have a space $\mathcal{H}^{\otimes r}$ of block messages of fixed length $r$ and you want to compress all of them by use of a variable-length code $c : \mathcal{H}^{\otimes r} \to \mathcal{H}^{\oplus s}$ with $s < r$. The code can only be lossless if

$$\dim \mathcal{H}^{\otimes r} \le \dim \mathcal{H}^{\oplus s}. \tag{63}$$

But since $\dim \mathcal{H}^{\otimes r} = k^r$ and $\dim \mathcal{H}^{\oplus s} = \frac{k^{s+1}-1}{k-1}$, we have

$$k^r \le \frac{k^{s+1}-1}{k-1} \tag{64}$$

$$\Rightarrow \quad k^{r+1} \le k^{s+1} + k^r - 1, \tag{65}$$

which is wrong for $r > s$ and $k > 1$, so you cannot compress all block messages of a given length. Now say you have a space $\mathcal{H}^{\oplus r}$ of variable-length messages with maximal length $r$. Assume that there is a universal lossless code $c$ that reduces the length of all messages in $\mathcal{H}^{\oplus r}$. The code can only be lossless if $\dim \mathcal{H}^{\oplus r} \le \dim \mathcal{H}^{\oplus s}$, which is obviously wrong for $r > s$, so you cannot compress all variable-length messages with a given maximal length. In conclusion, there is no universal lossless compression that reduces the size of all messages. Some messages are unavoidably lengthened by a lossless code. By choosing mutually orthogonal source states, one can derive the analogous statement for the classical case.



D. No lossless compression of unknown messages 



Now we come to a no-compression theorem that is typ- 
ically quantum. In quantum mechanics there is a pro- 
found difference between a known and an unknown state. 
For example, a known state can be cloned (by simply 
preparing another copy of it), whereas an unknown state 
cannot be cloned. 

Assume that there is a lossless quantum compression algorithm $c : \mathcal{H}^{\otimes r} \to \mathcal{H}^{\oplus s}$ that compresses messages of fixed length $r$ to variable-length messages of maximal length $s$. As we have seen in the last section, a lossless code cannot compress all messages, so $s \ge r$. Now there is an oracle that hands you an arbitrary message $|x\rangle = \sum_i x_i |\omega_i\rangle$, where the $|\omega_i\rangle \in \mathcal{H}^{\otimes r}$ are mutually orthogonal states. The algorithm encodes the message $|x\rangle$ into $|c(x)\rangle = \sum_i x_i |c(\omega_i)\rangle$. Even if all the codeword components $|c(\omega_i)\rangle$ have determinate length $L_c(\omega_i)$, the total codeword $|c(x)\rangle$ has in general indeterminate length. If you want to remove redundant digits without loss of information, you must know at least an upper bound for its base length, i.e. the length of its longest component. Since you do not know the source message $|x\rangle$, you do not know the base length of its encoding $|c(x)\rangle$, so you have to assume the maximal length $s$. Since $s \ge r$, no compression is achieved. The same argument applies to quantum compression algorithms $c : \mathcal{H}^{\oplus r} \to \mathcal{H}^{\oplus s}$ compressing variable-length messages of maximal length $r$ to variable-length messages of maximal length $s$.

We conclude that lossless compression of unknown 
quantum messages is in general impossible. This state- 
ment is not true for the classical case. A classical message 
is not disturbed by a length measurement, so it can in 
principle be compressed without loss of information. It 
would have been nice to compress a quantum hard disk 
without loss of information just like a classical hard disk, 
but this cannot be accomplished in general. 

Now that we have found a lot of impossible things to do 
with quantum messages, it is time to look for the possible 
things. 







VI. LOSSLESS COMPRESSING CODES 

The intention of using compressing codes is to mini- 
mize the effort of communication between two parties: 
one who prepares, encodes, compresses and sends the 
messages and one who receives, decompresses, decodes 
and possibly reads them. So it's time for Alice and Bob 
to enter the scene. Alice prepares source messages $|x\rangle \in \mathcal{V}$ and encodes them into codewords $|c(x)\rangle \in \mathcal{H}^{\oplus r}$ by applying the encoder $C$. She compresses the codewords by removing redundant quantum digits and sends
the result to Bob, who receives them and decompresses 
them by appending quantum digits. After that he can de- 
code the messages by applying the decoder D and read 
them or use them as an input for further computations. 
The communication has been lossless, if the decoded mes- 
sage equals the source message. Note that it is not re- 
quired for Bob to read the message he received! In fact, 
if Bob wants to use the message as an input for a quan- 
tum computer, he even must not do that, else he will 
potentially lose information. We require Alice to know 
which source messages she prepares, otherwise no lossless 
compression is possible, as we have seen in the previous 
section. 



A. Why prefix quantum codes are not very useful 

In classical information theory, prefix codes are favored 
for lossless coding. The reason is that they are instan- 
taneous, which means that they carry their own length 
information (see section IIIB2). Prefix codewords can 
be sent or stored without a separating signal between 
them. The decoder can add word separators ( "commas" ) 
while reading the sequence from left to right. Whenever 
a string of letters yields a valid codeword, the decoder 
can add a comma and proceed. After all, a continuous 
stream of letters is separated into valid codewords. 

Prefix codewords can be separated while reading the 
sequence, but in the quantum case this is potentially a 
very bad thing to do. Reading a stream of quantum 
letters means in general disturbing the message all the 
time. Therefore, the length information is generally not 
available. Furthermore, prefix codewords are in general 
longer than non-prefix codewords, because there are less 
prefix codewords of a given maximal length than possi- 
ble codewords. Hence, by using prefix codewords qbits 
are wasted to encode length information which is unavail- 
able anyway. We conclude that prefix quantum codes are 
practically not very useful. 



B. A classical side-channel

One could try to encode length information in a different quantum channel, as proposed by Braunstein et al. (unnecessarily they used prefix codewords anyhow). But that does not fix the problem. Whatever one does, reading out length information about different components of a variable-length codeword amounts to a length measurement and hence means disturbing the message. Yet there must be some way to determine where the codewords have to be separated, else the message cannot be decoded at all. Here is an idea: Use a classical side-channel to inform the receiver where the codewords have to be separated. This has two significant advantages:

• If the length information equals the base length of the codeword, the message is not disturbed and can be losslessly transmitted and decoded.

• Abandoning the prefix condition, shorter codewords can be chosen, such that the quantum channel is used with higher efficiency.

[Figure: a quantum channel carrying three superposed codewords, alongside a classical channel carrying their base lengths 7, 5 and 2.]

FIG. 5. Storing length information in a classical side-channel.

Let us give an example (see Fig. 5). Alice wants to send a message $|x_1\rangle$ which is encoded into the codeword $|c(x_1)\rangle = \frac{1}{\sqrt{3}}(|1001101\rangle + |1101\rangle + |10\rangle)$. The base length of $|c(x_1)\rangle$ is 7, so she submits that information through the classical channel. Depending on which realization of variable-length messages Alice and Bob have agreed to use, Alice sends enough qbits (at least 7) representing the codeword $|c(x_1)\rangle$ through the quantum channel. The next codeword is $|c(x_2)\rangle = \frac{1}{\sqrt{3}}(|11\rangle + |1011\rangle + |11101\rangle)$. The base length of $|c(x_2)\rangle$ is 5, so Alice sends the length information "5" through the classical channel and enough qbits (at least 5) representing the codeword $|c(x_2)\rangle$ through the quantum channel. She proceeds like that with all following messages. On Bob's side, there is a continuous stream of qbits coming through the quantum channel and a continuous stream of classical bits coming through the classical channel. Bob can read out the classical length information, separate the qbits into the specified blocks and apply the decoder to each codeword. In the end, Bob obtains all source messages without loss of information.

C. How much compression?

1. Lower bound


How much compression can maximally be achieved by using the method sketched in section VI B? Say Alice has an ensemble $\mathcal{E} = \{p, X\}$ of $m = |X|$ messages $|x_i\rangle \in X$, $i = 1, \ldots, m$, that she wants to encode by $k$-ary codewords. The source space $\mathcal{V}$ is spanned by the elements of $X$, thus $\mathcal{V} := \mathrm{Span}(X)$, and has dimension $d := \dim \mathcal{V}$. Alice fixes a basis set $B_\mathcal{V}$ of $d$ orthonormal vectors $|\omega_i\rangle$, $i = 1, \ldots, d$. The ensemble $\mathcal{E}$ corresponds to the message matrix

$$\sigma := \sum_{i=1}^{m} p(x_i)\, |x_i\rangle\langle x_i| = \sum_{i,j=1}^{d} \sigma_{ij}\, |\omega_i\rangle\langle\omega_j| \tag{66}$$

with $\sigma_{ij} := \langle\omega_i|\sigma|\omega_j\rangle$ and $\sum_{i=1}^{d} \sigma_{ii} = 1$. The source messages are encoded by the isometric map $c : \mathcal{V} \to \mathcal{H}^{\oplus}$, defined by

$$|\omega_i\rangle \mapsto |c(\omega_i)\rangle, \quad i = 1, \ldots, d. \tag{67}$$

The code space is $k$-ary, which means that $k = \dim \mathcal{H}$. Let each codeword $|c(\omega_i)\rangle$ have determinate length $L_c(\omega_i)$, such that the code length operator $L_c$ on $\mathcal{V}$ is diagonal in the basis $B_\mathcal{V}$ and reads

$$L_c = \sum_{i=1}^{d} L_c(\omega_i)\, |\omega_i\rangle\langle\omega_i|. \tag{68}$$

The codewords $|c(x_i)\rangle$ are not necessarily prefix, because Alice can encode the length information about each codeword in a classical side-channel. In order for the transmission to be lossless, she has to transmit the base length $\underline{L}_c(x_i)$ of each codeword corresponding to the source message $|x_i\rangle$. The base length is at least as long as the expected code length of the codeword, hence

$$\underline{L}_c(x_i) \ge \langle x_i|L_c|x_i\rangle. \tag{69}$$

Now we are interested in the average base length, since this determines the compression rate. The average base length is bounded from below by

$$\underline{L}_c(\mathcal{E}) = \sum_{i=1}^{m} p(x_i)\, \underline{L}_c(x_i) \tag{70}$$

$$\ge \sum_{i=1}^{m} p(x_i)\, \langle x_i|L_c|x_i\rangle = \mathrm{Tr}\{\sigma L_c\} \tag{71}$$

$$= \sum_{i=1}^{d} \sigma_{ii}\, L_c(\omega_i). \tag{72}$$

Now we perform the following trick. As already stated, non-prefix codewords can be chosen shorter than (or at most as long as) prefix codewords. Consider an arbitrary prefix code $c'$, then

$$L_{c'}(\omega_i) = L_c(\omega_i) + l_{c'}(\omega_i) \ge L_c(\omega_i), \tag{73}$$

where $l_{c'}(\omega_i) \ge 0$ is the length difference between the prefix and the non-prefix codeword for $|\omega_i\rangle$. Prefix codes, just like all uniquely decodable symbol codes, have to fulfill the Kraft inequality

$$\sum_{i=1}^{d} k^{-L_{c'}(\omega_i)} \le 1. \tag{74}$$

Since the code length operator $L_{c'}$ is diagonal in the basis $B_\mathcal{V}$, we can express the above condition by the quantum Kraft inequality

$$\mathrm{Tr}_\mathcal{V}\{k^{-L_{c'}}\} \le 1, \tag{75}$$

where $L_{c'} := L_c + l_{c'}$ and

$$l_{c'} := \sum_{i=1}^{d} l_{c'}(\omega_i)\, |\omega_i\rangle\langle\omega_i|. \tag{76}$$
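To see how relations (73) and (74) work together, the following Python sketch evaluates the Kraft sum for two binary length sets. Both sets are illustrative assumptions: the non-prefix lengths (0, 1, 2, 2) stand for the empty word together with the words 1, 10 and 11, and the prefix lengths (1, 2, 3, 3) for a code such as 0, 10, 110, 111.

```python
def kraft_sum(lengths, k=2):
    """Left-hand side of the Kraft inequality, sum_i k**(-L_i)."""
    return sum(k ** (-l) for l in lengths)

non_prefix_lengths = [0, 1, 2, 2]  # illustrative: empty word, 1, 10, 11
prefix_lengths = [1, 2, 3, 3]      # illustrative: 0, 10, 110, 111

# Length differences l_c'(omega_i) between the prefix and non-prefix codewords.
extra_lengths = [lp - ln for lp, ln in zip(prefix_lengths, non_prefix_lengths)]

q_prefix = kraft_sum(prefix_lengths)          # satisfies the Kraft inequality
q_non_prefix = kraft_sum(non_prefix_lengths)  # violates it
```

The shorter non-prefix lengths violate the Kraft inequality on their own, which is exactly the room that the additional lengths $l_{c'}(\omega_i) \ge 0$ have to fill.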

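The quantum Kraft inequality was derived for the first time by Schumacher and Westmoreland. Here, the quantum Kraft inequality requires that

$$Q := \sum_{i=1}^{d} k^{-L_c(\omega_i) - l_{c'}(\omega_i)} \le 1. \tag{77}$$

Now define implicit probabilities

$$q(\omega_i) := \frac{1}{Q}\, k^{-L_c(\omega_i) - l_{c'}(\omega_i)}, \tag{78}$$

which can be rewritten as

$$L_c(\omega_i) = -\log_k q(\omega_i) - \log_k Q - l_{c'}(\omega_i). \tag{79}$$

Weighting with the $\sigma_{ii}$ and summing yields

$$\sum_{i=1}^{d} \sigma_{ii}\, L_c(\omega_i) = -\sum_{i=1}^{d} \sigma_{ii} \log_k q(\omega_i) - \log_k Q - l', \tag{80}$$

where

$$l' := \sum_{i=1}^{d} \sigma_{ii}\, l_{c'}(\omega_i) = \mathrm{Tr}\{\sigma\, l_{c'}\} \tag{81}$$

is the average additional length. The inequality (72) can now be expressed by

$$\underline{L}_c(\mathcal{E}) \ge -\sum_{i=1}^{d} \sigma_{ii} \log_k q(\omega_i) - \log_k Q - l'. \tag{82}$$

Gibbs' inequality implies that

$$\underline{L}_c(\mathcal{E}) \ge -\sum_{i=1}^{d} \sigma_{ii} \log_k \sigma_{ii} - \log_k Q - l'. \tag{83}$$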

The von-Neumann entropy of the message matrix $\sigma$ cannot decrease by a non-selective projective measurement in the basis $B_\mathcal{V}$, hence

$$S(\sigma) \le S(\sigma'), \tag{84}$$

where

$$\sigma' := \sum_{i=1}^{d} |\omega_i\rangle\langle\omega_i|\sigma|\omega_i\rangle\langle\omega_i| = \sum_{i=1}^{d} \sigma_{ii}\, |\omega_i\rangle\langle\omega_i|. \tag{85}$$

Since

$$S(\sigma') = -\sum_{i=1}^{d} \sigma_{ii} \log_2 \sigma_{ii} = -\log_2 k \sum_{i=1}^{d} \sigma_{ii} \log_k \sigma_{ii}, \tag{86}$$

relation (84) states that

$$-\sum_{i=1}^{d} \sigma_{ii} \log_k \sigma_{ii} \ge \frac{1}{\log_2 k}\, S(\sigma). \tag{87}$$

Using (87) together with the Kraft inequality $Q \le 1$, relation (83) transforms into

$$\log_2 k \cdot \{\underline{L}_c(\mathcal{E}) + l'\} \ge S(\sigma) - \log_2 Q \ge S(\sigma). \tag{88}$$

Recalling the definition of the code information (56) and defining the length information that can be drawn into the classical side-channel by

$$I' := \log_2 k \cdot l', \tag{89}$$

we finally arrive at the lower bound relation

$$\underline{I}_c(\mathcal{E}) + I' \ge S(\sigma). \tag{90}$$

If $c$ is a uniquely decodable symbol code, e.g. a prefix code, we have $I' = 0$. Inequality (90) then states that the ensemble $\mathcal{E}$ cannot be losslessly compressed below $S(\sigma)$ qbits. However, by drawing length information into a classical side-channel it is possible to reduce the average number of qbits passing through the quantum channel below the von-Neumann entropy. We will give an example later on where this really happens.

2. Upper bound

Let us look for an upper bound for the compression that can be achieved. In order to encode every source vector in $\mathcal{V}$ by a $k$-ary code, we need at most

$$\underline{L}_c(x) \le \lceil \log_k(\dim \mathcal{V}) \rceil < \log_k(\dim \mathcal{V}) + 1 \tag{91}$$

digits. Using $\log_a x = \log_a b \cdot \log_b x$, we have

$$\underline{I}_c(\mathcal{E}) < \log_2(\dim \mathcal{V}) + \log_2 k. \tag{92}$$

This upper bound is neither very tight nor is it related to the von-Neumann entropy. However, our efforts to find a more interesting upper bound were not successful. It remains an open question to find such a bound and hence a quantum mechanical generalization of Shannon's theorem,

$$H(\mathcal{E}) \le I_c(\mathcal{E}) < H(\mathcal{E}) + \log_2 k, \tag{93}$$

which looks more familiar for $k = 2$, such that $\log_2 k = 1$ and $I_c(\mathcal{E}) = L_c(\mathcal{E})$.

D. Quantum Morse codes



One way to avoid a classical side-channel is to leave a pause between the quantum codewords, which amounts to an additional orthogonal "comma state". Such a code is a quantum analogue of the Morse code, where the codewords are also separated by a pause in order to avoid prefix codewords. Of course, the codewords plus the pause are prefix. Due to the close analogy one could speak of quantum Morse codes. Here, the information $I'$ needed for the comma state is independent of the statistics, because the comma state must be sent after each letter codeword, no matter which one. In contrast to that, with a classical side-channel $I'$ in general depends on the statistics: if one transmits the length of each codeword through a classical side-channel, one can use a Huffman code to find shorter codewords for more frequent length values. This is done in the following compression scheme.



VII. A LOSSLESS COMPRESSION SCHEME 



Let us construct an explicit coding scheme that realizes 
lossless quantum compression. 



A. Preparations 



Alice and Bob have a quantum computer on both sides of the channel. They both allocate a register of $r$ $k$-ary quantum digits, whose physical space is given by $\mathcal{R} = \mathcal{D}^{\otimes r}$ with $\mathcal{D} = \mathbb{C}^k$. They agree to use neutral-prefix codewords (see section III B 3) to implement variable-length coding, hence the message space is $\mathcal{N}_r$ of dimension $k^r$ and is physically realized by the operational space $\mathcal{R}$. Alice prepares source messages $|x_i\rangle$, $i = 1, \ldots, m$, from a set $X$. The space spanned by these messages is the source space $\mathcal{V} = \mathrm{Span}(X)$. Alice prepares each message $|x\rangle \in X$ with probability $p(x)$, which gives the ensemble $\mathcal{E} := \{p, X\}$. She encodes the source messages into variable-length codewords $|c(x)\rangle \in \mathcal{N}_r$ of maximal length $r$. If the dimension of $\mathcal{V}$ is given by $d := \dim \mathcal{V}$, then the length of the register must fulfill



$$r \ge \lceil \log_k d \rceil. \tag{94}$$



If the set $X$ is linearly dependent, Alice creates a copy $\tilde{X} = X$, removes the most probable message from $\tilde{X}$ and puts it into a list $M$. Next, she again removes the most probable message from $\tilde{X}$, appends it to the list $M$ and checks if the list is now linearly dependent. If so, she removes the last element from $M$ again. Then she proceeds with removing the next most probable message from $\tilde{X}$ and appending it to $M$, checking for linear dependence, and so on. In the end she obtains a list

$$M = (|x_1\rangle, \ldots, |x_d\rangle) \tag{95}$$

of linearly independent source messages from $X$, ordered by decreasing probability, such that $p(x_i) \ge p(x_j)$ for $i < j$. She performs a Gram-Schmidt orthonormalization on the list $M$, giving a list $B$ of orthonormal vectors defined by

$$|\omega_1\rangle := |x_1\rangle, \tag{96}$$

$$|\omega_i\rangle := N_i \Big[ 1 - \sum_{j=1}^{i-1} |\omega_j\rangle\langle\omega_j| \Big] |x_i\rangle, \tag{97}$$

with $i = 2, \ldots, d$ and suitable normalization constants $N_i$. The elements of $B$ form an orthonormal basis $B_\mathcal{V}$ for the source space $\mathcal{V}$. Now she assigns codewords

$$|c(\omega_i)\rangle := |Z_k^r(i-1)\rangle, \quad i = 1, \ldots, d, \tag{98}$$

of increasing significant length

$$L_c(\omega_i) = \lceil \log_k(i) \rceil. \tag{99}$$

Note that the first codeword is the empty message $|\varnothing\rangle = |Z_k^r(0)\rangle = |0 \cdots 0\rangle$, which does not have to be sent through the quantum channel at all. Instead, nothing is sent through the quantum channel and a signal representing "length 0" is sent through the classical channel. Alice implements the encoder

$$C := \sum_{i=1}^{d} |c(\omega_i)\rangle\langle\omega_i| \tag{100}$$

by a gate array on $\mathcal{R}$. Then she calculates the base lengths of the codewords,

$$\underline{L}_c(x) = \max_{i=1,\ldots,d} \{ L_c(\omega_i) \mid |\langle\omega_i|x\rangle|^2 > 0 \}, \tag{101}$$

for every message $|x\rangle \in X$ and writes them into a table. The classical information is compressed using Huffman coding of the set of distinct base length values $\mathcal{L} = \{L_c(\omega_1), \ldots, L_c(\omega_d)\}$. Alice constructs the Huffman codeword for each length $l \in \mathcal{L}$ appearing with probability

$$p_l = \sum_{x :\, \underline{L}_c(x) = l} p(x) \tag{102}$$

and writes them into a table. At last, Alice builds a gate array realizing the decoder $D = C^{-1}$ and gives it to Bob. For the classical channel she hands the table with the Huffman codewords for the distinct lengths to Bob. Now everything is prepared and the communication can begin.

B. Communication protocol

Alice prepares the message $|x\rangle \in X$ and applies the encoder $C$ to obtain $|c(x)\rangle$. She looks up the corresponding code base length $\underline{L}_c(x)$ in the table. If $\underline{L}_c(x) < r$, she truncates the message to $\underline{L}_c(x)$ digits by removing $r - \underline{L}_c(x)$ leading digits. She sends the $\underline{L}_c(x)$ digits through the quantum channel and the length information $\underline{L}_c(x)$ through the classical channel. Then she proceeds with the next message.

For any message $|x\rangle$ Alice sends, Bob receives the length information $\underline{L}_c(x)$ through the classical channel and $\underline{L}_c(x)$ digits through the quantum channel. He adds $r - \underline{L}_c(x)$ quantum digits in the state $|0\rangle$ at the beginning of the received codeword. He then applies the decoder $D$ and obtains the original message $|x\rangle$ with perfect fidelity. Note that Alice can send any message from the source message space $\mathcal{V}$; the protocol will ensure a lossless communication of the message. For such arbitrary messages, however, compression will in general not be achieved, since the protocol is only adapted to the particular ensemble $\mathcal{E}$. Also, Bob can just as well store all received quantum digits on his quantum hard disk and the received length information on his classical hard disk, and go to bed. The next day, he can scan the classical hard disk for length information and separate and decode the corresponding codewords on the quantum hard disk. The protocol works for online communication as well as for data storage.

C. An explicit example

Alice and Bob want to communicate vectors of a 4-dimensional Hilbert space $\mathcal{V} = \mathrm{Span}\{|0\rangle, |1\rangle, |2\rangle, |3\rangle\}$, where we use the row notation in the following. Alice decides to use the (linearly dependent) source message set

$$X = \{|a\rangle, |b\rangle, |c\rangle, |d\rangle, |e\rangle, |f\rangle, |g\rangle, |h\rangle, |i\rangle, |j\rangle\}, \tag{103}$$

whose elements are given by

$$|a\rangle = \tfrac{1}{2}(1,1,1,1), \tag{104}$$

$$|b\rangle = \tfrac{1}{\sqrt{7}}(1,2,1,1), \tag{105}$$

$$|c\rangle = \tfrac{1}{\sqrt{12}}(1,3,1,1), \tag{106}$$

$$|d\rangle = \tfrac{1}{\sqrt{19}}(1,4,1,1), \tag{107}$$

$$|e\rangle = \tfrac{1}{\sqrt{2}}(1,0,1,0), \tag{108}$$

$$|f\rangle = \tfrac{1}{\sqrt{5}}(2,0,1,0), \tag{109}$$

$$|g\rangle = \tfrac{1}{\sqrt{10}}(3,0,1,0), \tag{110}$$

$$|h\rangle = \tfrac{1}{\sqrt{2}}(0,1,0,1), \tag{111}$$

$$|i\rangle = \tfrac{1}{\sqrt{5}}(0,2,0,1), \tag{112}$$

$$|j\rangle = \tfrac{1}{\sqrt{10}}(0,3,0,1), \tag{113}$$



and which are used with the probabilities

$$p(a) = 0.6, \quad p(b) = p(c) = p(d) = 0.1, \tag{115}$$

$$p(e) = \ldots = p(j) = \tfrac{1}{60}. \tag{116}$$

The Shannon entropy of the ensemble $\mathcal{E} = \{p, X\}$ is

$$H(\mathcal{E}) = 2.02945, \tag{117}$$

and the classical raw information (45) reads

$$I_0(X) = \log_2 |X| = 3.32193, \tag{118}$$

which gives an optimal classical compression rate of $R = H/I_0 = 0.610924$. If Bob knows Alice's list of possible messages, then this rate could in the optimal case be achieved by purely classical communication. However, Bob does not know the list and classical communication is not the task here. The message matrix $\sigma = \sum_{x \in X} p(x)\, |x\rangle\langle x|$, given by

$$\sigma = \begin{pmatrix} 0.214549 & 0.224624 & 0.197882 & 0.177882 \\ 0.224624 & 0.40302 & 0.224624 & 0.244624 \\ 0.197882 & 0.224624 & 0.191216 & 0.177882 \\ 0.177882 & 0.244624 & 0.177882 & 0.191216 \end{pmatrix}, \tag{119}$$

has von-Neumann entropy

$$S(\sigma) = 0.571241. \tag{120}$$
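The entries of the message matrix and the Shannon entropy can be reproduced with a few lines of Python. The sketch below assumes that the ten source vectors are the normalized integer vectors (1,1,1,1), (1,2,1,1), (1,3,1,1), (1,4,1,1), (1,0,1,0), (2,0,1,0), (3,0,1,0), (0,1,0,1), (0,2,0,1), (0,3,0,1) and that $p(e) = \ldots = p(j) = 0.1/6$, the value forced by normalization; the variable names are illustrative. The von-Neumann entropy is not computed here, because that would additionally require an eigenvalue routine.

```python
import math

# Unnormalized source vectors of the example (assumed reconstruction).
raw = {"a": (1, 1, 1, 1), "b": (1, 2, 1, 1), "c": (1, 3, 1, 1), "d": (1, 4, 1, 1),
       "e": (1, 0, 1, 0), "f": (2, 0, 1, 0), "g": (3, 0, 1, 0),
       "h": (0, 1, 0, 1), "i": (0, 2, 0, 1), "j": (0, 3, 0, 1)}
prob = {"a": 0.6, "b": 0.1, "c": 0.1, "d": 0.1}
prob.update({name: 0.1 / 6 for name in "efghij"})

def normalize(v):
    """Return v divided by its Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

states = {name: normalize(v) for name, v in raw.items()}

# Message matrix sigma_ij = sum_x p(x) x_i x_j (real vectors, so no conjugation).
sigma = [[sum(prob[n] * states[n][i] * states[n][j] for n in states)
          for j in range(4)] for i in range(4)]

# Shannon entropy of the ensemble, in bits.
shannon = -sum(p * math.log2(p) for p in prob.values())
trace = sum(sigma[i][i] for i in range(4))
```

Under these assumptions the computed entries agree with the matrix (119) to the printed precision, and the trace equals 1 as required for a density matrix.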



The orthogonalization procedure yields the basis $B_\mathcal{V} = \{|\omega_i\rangle\}$ with

$$|\omega_1\rangle = (0.5, 0.5, 0.5, 0.5) \tag{121}$$

$$|\omega_2\rangle = (-0.288675, 0.866025, -0.288675, -0.288675) \tag{122}$$

$$|\omega_3\rangle = (0.408248, 0, 0.408248, -0.816497) \tag{123}$$

$$|\omega_4\rangle = (0.707107, 0, -0.707107, 0). \tag{124}$$

Let the quantum channel be binary, i.e. let $k = 2$. The codewords are constructed along $|c(\omega_i)\rangle = |Z_2^2(i-1)\rangle$, yielding the variable-length states

$$|c(\omega_1)\rangle = |\varnothing\rangle \tag{125}$$

$$|c(\omega_2)\rangle = |1\rangle \tag{126}$$

$$|c(\omega_3)\rangle = |10\rangle \tag{127}$$

$$|c(\omega_4)\rangle = |11\rangle, \tag{128}$$

that span the code space $\mathcal{C}$. In a neutral-prefix code they are realized by the 2-qbit states

$$|c(\omega_1)\rangle = |00\rangle \tag{129}$$

$$|c(\omega_2)\rangle = |01\rangle \tag{130}$$

$$|c(\omega_3)\rangle = |10\rangle \tag{131}$$

$$|c(\omega_4)\rangle = |11\rangle \tag{132}$$

that span the operational code space, which is a subspace of the physical space $\mathcal{R} = \mathbb{C}^2 \otimes \mathbb{C}^2$. Alice realizes the encoder $C : \mathcal{V} \to \mathcal{C}$, $C = \sum_i |c(\omega_i)\rangle\langle\omega_i|$, given by

$$C = \begin{pmatrix} 0.5 & 0.5 & 0.5 & 0.5 \\ -0.288675 & 0.866025 & -0.288675 & -0.288675 \\ 0.408248 & 0 & 0.408248 & -0.816497 \\ 0.707107 & 0 & -0.707107 & 0 \end{pmatrix} \tag{134}$$

and the decoder $D = C^{-1}$, given by

$$D = \begin{pmatrix} 0.5 & -0.288675 & 0.408248 & 0.707107 \\ 0.5 & 0.866025 & 0 & 0 \\ 0.5 & -0.288675 & 0.408248 & -0.707107 \\ 0.5 & -0.288675 & -0.816497 & 0 \end{pmatrix} \tag{135}$$

by gate arrays and gives the decoder to Bob. The encoded alphabet is obtained by $|c(x)\rangle = C|x\rangle$. Alice writes the base lengths of the codewords

$$\underline{L}_c(a) = 0, \quad \underline{L}_c(b) = \underline{L}_c(c) = \underline{L}_c(d) = 1, \tag{136}$$

$$\underline{L}_c(e) = \ldots = \underline{L}_c(j) = 2 \tag{137}$$

in a table and calculates the corresponding probabilities

$$p_0 = 0.6, \quad p_1 = 0.3, \quad p_2 = 0.1. \tag{138}$$

She constructs Huffman codewords for each length,

$$c_0 = 1, \quad c_1 = 01, \quad c_2 = 00, \tag{139}$$

such that the average bit length is

$$L' = \sum_{l=0}^{2} p_l\, L(c_l) = 1.4, \tag{140}$$

which is the optimal value for a symbol code and lies close to the Shannon entropy of the length ensemble,

$$I' = -\sum_{l=0}^{2} p_l \log_2 p_l = 1.29546. \tag{141}$$
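The Huffman step for the length ensemble can be sketched in Python with the standard `heapq` module. The function below is a generic textbook construction returning only codeword lengths, not the authors' program; the actual bit patterns (for instance 1, 01, 00) depend on an arbitrary labeling of the tree branches.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return the binary Huffman codeword length of each symbol."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # merging adds one bit to every symbol below
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

p_len = [0.6, 0.3, 0.1]            # probabilities of base lengths 0, 1, 2
code_lengths = huffman_lengths(p_len)
average_bits = sum(p * l for p, l in zip(p_len, code_lengths))
length_entropy = -sum(p * math.log2(p) for p in p_len)
```

For these three probabilities the most frequent length value receives a one-bit codeword and the other two receive two-bit codewords, which gives the average of 1.4 bits against an entropy of about 1.295 bits.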



Alice hands the table with the Huffman codewords to Bob 
and tells him that he must listen to the classical channel, 
decode the arriving Huffman codewords into numbers, 
receive packages of qbits, whose size corresponds to the 
decoded numbers, and add to each package enough lead- 
ing qbits in the state |0) to end up with 2 qbits. Then 
he must apply the decoder D to each extended package 
and he will get Alice's original messages. 
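The whole procedure can be simulated classically on state vectors. The following Python sketch is not the authors' Mathematica program; it is a minimal illustration that assumes the ten source vectors are the normalized integer vectors listed in the code and that $p(e) = \ldots = p(j) = 0.1/6$. It builds the basis by Gram-Schmidt, lets "Alice" truncate each codeword to its base length and lets "Bob" pad and decode; all function names are hypothetical.

```python
import math

# Unnormalized source vectors and probabilities (assumed reconstruction).
RAW = {"a": (1, 1, 1, 1), "b": (1, 2, 1, 1), "c": (1, 3, 1, 1), "d": (1, 4, 1, 1),
       "e": (1, 0, 1, 0), "f": (2, 0, 1, 0), "g": (3, 0, 1, 0),
       "h": (0, 1, 0, 1), "i": (0, 2, 0, 1), "j": (0, 3, 0, 1)}
PROB = {"a": 0.6, "b": 0.1, "c": 0.1, "d": 0.1}
PROB.update({name: 0.1 / 6 for name in "efghij"})
TOL = 1e-9  # numerical threshold below which an amplitude counts as zero

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

STATES = {name: normalize(v) for name, v in RAW.items()}

def build_basis(states, prob):
    """Gram-Schmidt over the messages in order of decreasing probability,
    skipping every message that is linearly dependent on the earlier ones."""
    basis = []
    for name in sorted(states, key=lambda s: -prob[s]):
        v = list(states[name])
        for w in basis:
            c = dot(w, v)
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = math.sqrt(dot(v, v))
        if norm > TOL:
            basis.append([vi / norm for vi in v])
    return basis

def alice_send(x, basis):
    """Encode x, determine its base length n and keep only the last n qbits."""
    code = [dot(w, x) for w in basis]  # amplitudes on |00>, |01>, |10>, |11>
    n = max((i.bit_length() for i, a in enumerate(code) if abs(a) > TOL), default=0)
    return n, code[:2 ** n]            # n is sent classically, the rest quantum

def bob_receive(n, qbits, basis):
    """Pad with leading |0> qbits (zero amplitudes) and apply the decoder."""
    code = list(qbits) + [0.0] * (len(basis) - len(qbits))
    dim = len(basis[0])
    return [sum(c * w[j] for c, w in zip(code, basis)) for j in range(dim)]

BASIS = build_basis(STATES, PROB)
lengths, max_error = {}, 0.0
for name, x in STATES.items():
    n, qbits = alice_send(x, BASIS)
    decoded = bob_receive(n, qbits, BASIS)
    max_error = max(max_error, max(abs(u - v) for u, v in zip(decoded, x)))
    lengths[name] = n
average_qbits = sum(PROB[name] * lengths[name] for name in STATES)
```

In this sketch a register of $n$ qbits is stored as a list of $2^n$ amplitudes, so a truncated codeword of base length 0 is the single amplitude of the empty message. The variable `max_error` measures the deviation between decoded and original vectors, and `average_qbits` is the mean number of qbits put on the quantum channel.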

Say Alice wants to send the message $|a\rangle$. She prepares $|a\rangle$ and applies the encoder $C$ to obtain the codeword $|00\rangle$. She looks up the corresponding base length $\underline{L}_c(a) = 0$ and truncates the codeword to $\underline{L}_c(a) = 0$ qbits. In this case there are no qbits left at all, so she sends nothing through the quantum channel and the Huffman codeword for "length 0" through the classical channel. Bob receives the classical length information "0" and knows that nothing comes through the quantum channel and that in this case he has to prepare 2 qbits in the state $|00\rangle$. He applies the decoder $D$ and obtains Alice's original message $|a\rangle$. In order to send message $|b\rangle$, Alice truncates the codeword to $\underline{L}_c(b) = 1$ qbit and obtains $\frac{1}{\sqrt{28}}(5\,|0\rangle + \sqrt{3}\,|1\rangle)$. She sends the qbit through the quantum channel together with the classical signal "length 1". Bob receives the length message and knows that he has to take the next qbit from the quantum channel and that he has to add 1 leading qbit in the state $|0\rangle$. He applies $D$ and obtains Alice's original message $|b\rangle$. The whole procedure works instantaneously and without loss of information. We have implemented the above example in a Mathematica program, and numerical simulations show that the procedure works fine and the specified compression of quantum data is achieved. (You can find the program and the package at [16].)

Let us look for the compression that has been achieved. 

a. The quantum code information, i.e. the average number of qbits being sent through the quantum channel,

$$\underline{I}_c = \sum_{x \in X} p(x)\, \underline{L}_c(x) = 0.5, \tag{142}$$

falls below(!) the von-Neumann entropy:

$$\underline{I}_c < S = 0.571241. \tag{143}$$

Such a behaviour has already been suspected in section VI C 1.

b. The quantum raw information, i.e. the size of the non-compressed messages, is given by

$$I_0 = \log_2(\dim \mathcal{V}) = 2, \tag{144}$$

hence the compression rate on the quantum channel reads

$$R_c = \frac{\underline{I}_c}{I_0} = 0.25. \tag{145}$$

In other words, the number of qbits passing through the quantum channel is reduced by 75 %. Sending 100 messages without compression requires 200 qbits. Using the compression scheme, Alice typically sends 50 qbits.

c. The sum of both quantum and classical information,

$$I_{tot} = \underline{I}_c + I' = 1.79546, \tag{146}$$

is smaller than the Shannon entropy (117) of the original ensemble $\mathcal{E}$:

$$I_{tot} < H = 2.02945. \tag{147}$$

Thus it is better to use the quantum compression scheme than to simply tell Bob on the phone which state he must prepare. As already suspected, $I_{tot}$ is still greater than the von-Neumann entropy (120),

$$I_{tot} > S = 0.571241. \tag{148}$$

The classical part of the compression depends on the algorithm. Only in the ideal case can the information be compressed down to the Shannon entropy of the length ensemble, given by $I'$. Using the Huffman scheme, the average length $L' = 1.4$ represents the information that is effectively sent through the classical channel, such that the total effective information is given by

$$I_{eff} = \underline{I}_c + L' = 1.9. \tag{149}$$



d. The total compression rate of both channels reads

$$R_{tot} = \frac{\underline{I}_c + I'}{I_0} = 0.897731 < 1, \tag{150}$$

where it is assumed that the information on the classical channel can be compressed down to its Shannon entropy $I'$. Using the Huffman scheme (as we have done in our example), the information on the classical channel can only be compressed to $L' > I'$, such that the effective total compression rate is given by

$$R_{eff} := \frac{\underline{I}_c + L'}{I_0} = 0.95 < 1. \tag{151}$$
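The figures quoted in items a to d follow from the length probabilities alone, as this short Python sketch shows. It takes the base-length probabilities 0.6, 0.3, 0.1 and the Huffman codeword lengths 1, 2, 2 from the example as given; the variable names are illustrative.

```python
import math

p_len = {0: 0.6, 1: 0.3, 2: 0.1}        # probability of each base length
huffman_len = {0: 1, 1: 2, 2: 2}        # bits of the Huffman codeword per length

I_c = sum(p * l for l, p in p_len.items())                 # qbits on the quantum channel
I_class = -sum(p * math.log2(p) for p in p_len.values())   # ideal classical cost I'
L_class = sum(p_len[l] * huffman_len[l] for l in p_len)    # Huffman classical cost L'
I_0 = math.log2(4)                                         # raw information, dim V = 4

R_quantum = I_c / I_0
R_tot = (I_c + I_class) / I_0
R_eff = (I_c + L_class) / I_0
```

All three rates stay below 1, with the effective rate slightly worse than the ideal one because a symbol code cannot reach the entropy of the length ensemble exactly.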



Thus in any case there is an overall compression. For higher dimensional source spaces (hence more letters), the compression is expected to get better (provided the letter distribution is not too uniform). However, the numerical effort for higher dimensional letter spaces increases very fast and we want to keep the example as simple as possible.

VIII. CONCLUDING REMARKS

We have developed a general framework for variable-length quantum messages and defined an observable measuring the quantum information content of individual
states by the number of qbits needed to represent the 
state by a given code. We derived some basic state- 
ments about lossless compression. In particular, we have 
demonstrated that a quantum message can only be com- 
pressed without loss of information if the source mes- 
sage is a priori known to the sender. On these grounds, 
we have worked out a lossless and instantaneous quan- 
tum data compression protocol. One can object that 
there is no use in compressing quantum states that are 
already known to the sender, because then Alice could as 
well tell Bob classically which of the quantum states she 
wants to communicate. However, such a pure classical 
communication would require Bob to have a list of pos- 
sible messages Alice may send. Moreover, for arbitrary 
quantum messages from the source space, Alice would 
need infinitely many bits to communicate them through 
a classical channel to Bob. In contrast to that, in our 
communication scheme Alice can send arbitrary messages 
from the source message space, but she must know which 
message she is going to send to get the base length. Bob 
needs only the decoder and the user instructions for the 
classical channel, then he can reobtain Alice's original 
messages with perfect fidelity. The protocol can individ- 
ually be adapted to a given message ensemble, such that 
compression is achieved for that ensemble. 



IX. OPEN QUESTIONS 

It would be satisfying to find an optimally compressing
lossless quantum code with a tight upper bound related to
the von-Neumann entropy. This would represent a quantum
analogue to Shannon's relation (93). There might be
interesting applications to quantum cryptography. By
combining the methods of quantum cryptography with the
methods of lossless compression, the efficiency of secure
data transfer may possibly be increased. Furthermore, it
would be interesting to see how the framework of
variable-length messages applies to quantum computation,
since the data stored in the register of a quantum computer
could also be regarded as a variable-length quantum message.
One could also think about variable-length quantum
error-correcting codes. We hope that the presented work
stimulates further discussion and theoretical research on
variable-length quantum coding and its applications.